Using Sentence Similarity Measure for Plagiarism Source Retrieval
Authors
Abstract
This paper describes the method implemented in the software submitted to the PAN 2014 competition for the source retrieval task. To generate queries, we use the most important noun phrases and words of sentences selected from a given suspicious document. To decide which documents to download as likely sources of plagiarism, we employ a sentence similarity measure.
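The two steps named in the abstract, query generation and similarity-based download filtering, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how "most important" terms are chosen or which sentence similarity measure is used, so the sketch ranks terms by their frequency in the suspicious document and uses Jaccard overlap as a stand-in measure. Noun-phrase extraction is omitted. All names (`make_query`, `sentence_similarity`, `should_download`, `STOP_WORDS`) and the 0.5 threshold are hypothetical.

```python
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "to", "in", "on", "and", "is", "are",
              "that", "for", "with", "as", "by", "we", "it", "this", "was"}


def tokenize(text):
    """Lowercase the text and return its alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def make_query(sentence, doc_counts, max_terms=5):
    """Build a query from the sentence's terms that are most frequent in the
    suspicious document (an assumed proxy for 'most important')."""
    terms = []
    for token in tokenize(sentence):
        if token not in STOP_WORDS and token not in terms:
            terms.append(token)
    # Stable sort: ties keep the order in which terms appear in the sentence.
    ranked = sorted(terms, key=lambda t: -doc_counts[t])
    return " ".join(ranked[:max_terms])


def sentence_similarity(a, b):
    """Jaccard overlap of content words (assumed stand-in for the paper's measure)."""
    words_a = set(tokenize(a)) - STOP_WORDS
    words_b = set(tokenize(b)) - STOP_WORDS
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def should_download(suspicious_sentence, snippet, threshold=0.5):
    """Download a search result if any snippet sentence is similar enough."""
    return any(
        sentence_similarity(suspicious_sentence, s) >= threshold
        for s in split_sentences(snippet)
    )


suspicious = ("Plagiarism detection needs retrieval. "
              "Retrieval finds plagiarism sources. Retrieval works.")
counts = Counter(tokenize(suspicious))
query = make_query("Retrieval finds plagiarism sources.", counts, max_terms=3)
```

In this sketch the query would be sent to a search engine, and `should_download` would be applied to each returned snippet before fetching the full document, which keeps the number of downloads low.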
Similar resources
Rule Based Plagiarism Detection using Information Retrieval - Notebook for PAN at CLEF 2011
This paper reports on the development of a plagiarism detection system as part of the plagiarism detection task at PAN 2011. The external plagiarism detection problem is solved with the help of Nutch, an open-source information retrieval (IR) system. The system comprises three phases: knowledge preparation, candidate retrieval, and plagiarism detection. From the source documents, know...
Persian Plagiarism Detection Using Sentence Correlations
This report explains our Persian plagiarism detection system, which we used to submit our run to the Persian PlagDet competition at FIRE 2016. The system was constructed in four main stages. The first is pre-processing and tokenization. The second is constructing a corpus of sentences from the combination of each source and suspicious document pair. Each sentence is considered to be a document and represented as a...
Adaptive Algorithm for Plagiarism Detection: The Best-Performing Approach at PAN 2014 Text Alignment Competition
The task of (monolingual) text alignment consists in finding similar text fragments between two given documents. It has applications in plagiarism detection, detection of text reuse, author identification, authoring aid, and information retrieval, to mention only a few. We describe our approach to the text alignment subtask of the plagiarism detection competition at PAN 2014, which resulted in ...
A Winning Approach to Text Alignment for Text Reuse Detection at PAN 2014
The task of (monolingual) text alignment consists in finding similar text fragments between two given documents. It has applications in plagiarism detection, detection of text reuse, author identification, authoring aid, and information retrieval, to mention only a few. We describe our approach to the text alignment subtask at the PAN 2014 plagiarism detection competition. Our method relies on a se...
Plagiarism Detection Using Information Retrieval and Similarity Measures Based on Image Processing Techniques - Lab Report for PAN at CLEF 2010
This paper describes the Barcelona Media Innovation Center's participation in the 2nd International Competition on Plagiarism Detection. In particular, our system focused on the external plagiarism detection task, which assumes the source documents are available. We present a two-step approach. In the first step of our method, we build an information retrieval system based on Solr/Lucene, segmen...
Journal title:
Volume, issue:
Pages: -
Publication date: 2014